Adaptable Text Filters and Unsupervised Neural Classifiers for Spam Detection

Authors

  • Bogdan Vrusias
  • Ian Golledge
Abstract

Spam detection has become a necessity for successful email communications, security and convenience. This paper describes a learning process where the text of incoming emails is analysed and filtered based on the salient features identified. The method described has promising results and at the same time significantly better performance than other statistical and probabilistic methods. The salient features of emails are selected automatically based on functions combining word frequency and other discriminating matrices, and emails are then encoded into a representative vector model. Several classifiers are then used for identifying spam, and self-organising maps seem to give significantly better results.
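
The abstract outlines a pipeline (select salient words, encode each email as a vector, classify with a self-organising map) without giving the scoring function or the map configuration. The following is a minimal sketch under stated assumptions, not the authors' implementation: the salience score (absolute difference in relative word frequency between classes), the binary encoding, the 3x3 map size, the decay schedules, and all function names are hypothetical choices made for illustration.

```python
import re
from collections import Counter

import numpy as np


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def salient_words(spam, ham, k):
    """Assumed score: |relative frequency in spam - relative frequency in ham|."""
    spam_counts = Counter(w for t in spam for w in tokenize(t))
    ham_counts = Counter(w for t in ham for w in tokenize(t))
    n_spam = sum(spam_counts.values()) or 1
    n_ham = sum(ham_counts.values()) or 1
    vocab = set(spam_counts) | set(ham_counts)
    score = {w: abs(spam_counts[w] / n_spam - ham_counts[w] / n_ham) for w in vocab}
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(vocab, key=lambda w: (-score[w], w))[:k]


def encode(text, features):
    """Binary vector: 1.0 where the salient word occurs in the email."""
    words = set(tokenize(text))
    return np.array([1.0 if f in words else 0.0 for f in features])


def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small self-organising map without using any labels."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows * cols, data.shape[1]))
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.1    # neighbourhood shrinks, stays > 0
        for x in data[rng.permutation(len(data))]:
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-dist2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
    return weights


def label_nodes(weights, data, labels):
    """After training, give each node the label of its nearest training vector."""
    return [labels[int(np.argmin(np.linalg.norm(data - w, axis=1)))] for w in weights]


def classify(text, features, weights, node_labels):
    """Return the label attached to the best-matching unit for this email."""
    x = encode(text, features)
    return node_labels[int(np.argmin(np.linalg.norm(weights - x, axis=1)))]


# Toy data, invented for illustration only.
spam = ["win free money now", "free prize claim now"]
ham = ["meeting agenda for monday", "project report attached"]
features = salient_words(spam, ham, 6)
data = np.array([encode(t, features) for t in spam + ham])
labels = ["spam"] * len(spam) + ["ham"] * len(ham)
weights = train_som(data)
node_labels = label_nodes(weights, data, labels)
```

Labels are used only after training, to name the map nodes, which keeps the map itself unsupervised. A new message would then be scored with `classify("claim your free prize", features, weights, node_labels)`.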

Similar resources

Online Self-Organised Map Classifiers as Text Filters for Spam Email Detection

Email communication today is a way of working and communicating for most businesses and the public in general. Being able to receive and send emails efficiently therefore becomes a must. Spam email detection and removal thus becomes a vital process for successful email communication, security and convenience. This paper describes a novel way of analysing and filtering incoming emails based on ...

Unsupervised and Supervised Neural Network Learning for Spam Detection

With the rise of technology over the last several decades, spam detection has become an important machine learning problem. Nowadays, an abundance of labeled email data makes it possible to build automatic systems that detect spam effectively. The majority of these systems are based on supervised machine learning classifiers. Even though some unsupervised systems exist as well, they are much less popular due to...

A New Hybrid Approach of K-Nearest Neighbors Algorithm with Particle Swarm Optimization for E-Mail Spam Detection

Email is one of the fastest and most economical forms of communication. The growing number of email users has led to an increase in spam in recent years. Spam not only costs users money, time, and bandwidth, but has also become a risk to the efficiency, reliability, and security of a network. Spam developers are always trying to find ways to escape the existing filters; therefore new filters to de...

Trends in Combating Image Spam E-mails

With the rapid adoption of the Internet as an easy way to communicate, the amount of unsolicited e-mail, known as spam, has been growing rapidly. The major problem with spam e-mails is the loss of productivity and the drain on IT resources. Today, spam arrives more rapidly than legitimate e-mail. Initially, spam e-mails contained only textual messages which were easily detected by the ...

Stacking Classifiers for Anti-Spam Filtering of E-Mail

We evaluate empirically a scheme for combining classifiers, known as stacked generalization, in the context of anti-spam filtering, a novel cost-sensitive application of text categorization. Unsolicited commercial email, or “spam”, floods mailboxes, causing frustration, wasting bandwidth, and exposing minors to unsuitable content. Using a public corpus, we show that stacking can improve the eff...

Publication date: 2008